An analytical approach to calculating bond percolation thresholds, sizes of k-cores, and sizes of 
giant connected components on structured random networks with non-zero clustering is presented. 
The networks are generated using a generalization of Trapman's [P. Trapman, Theor. Pop. Biol. 
71, 160 (2007)] model of cliques embedded in tree-like random graphs. The resulting networks have 
arbitrary degree distributions and tunable degree-dependent clustering. The effect of clustering 
on the bond percolation thresholds for networks of this type is examined and contrasted with 
some recent results in the literature. For very high levels of clustering the percolation threshold 
in these generalized Trapman networks is increased above the value it takes in a randomly-wired 
(unclustered) network of the same degree distribution. In assortative scale-free networks, where the 
variance of the degree distribution is infinite, this clustering effect can lead to a non-zero percolation 
(epidemic) threshold. 
I. INTRODUCTION 

There has been considerable recent interest in the 
study of random network models, with a view to un- 
derstanding the structure and dynamics of the Internet, 
citation networks, and other social, biological and tech- 
nological networks; see the reviews [H, S S 13 and refer- 
ences therein. The degree distribution is a fundamen- 
tal quantity of interest in these studies; here Pk is defined 
as the probability that a randomly chosen node (vertex) 
in the network has k neighbors. Random networks with 
a specified Pk may be generated using the so-called con- 
figuration model [5] , which randomly links pairs of nodes 
to give the correct degree distribution. The properties of 
networks generated in this manner are now well under- 
stood, with analytical results relying on the fact that such 
networks can be approximated very accurately by tree- 
like graphs (provided that Pk decays sufficiently rapidly 
for large $k$ [aSQ). 

However, most real-world networks are not tree-like, 
since the density of cycles (loops) of length three in such 
networks is non-zero, whereas this quantity vanishes (in 
the limit of infinite network size) for the configuration 
model. The local clustering coefficient for a node A is 
defined as the fraction of pairs of neighbors of node A 
which are also neighbors of each other Q. The degree- 
dependent clustering Ck is the average of the local clus- 
tering coefficient over the class of all nodes of degree 
k [1, [l3| . Because analytical results are difficult to obtain 
for networks containing loops, the question of how models 
incorporating both Pk and non-zero Ck (taken, for exam- 
ple, from real-world network data) differ in structure and 
dynamics from corresponding randomly-wired networks 
(where $C_k \to 0$) remains of considerable interest. 
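As a concrete illustration of the two definitions above, the following minimal Python sketch computes the local clustering coefficient of a node and the degree-dependent clustering for a small graph stored as a dictionary of neighbor sets. The function names and the example graph are illustrative assumptions, not part of the original analysis.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to test
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)


def degree_dependent_clustering(adj):
    """Average of the local clustering over all nodes of each degree k."""
    totals, counts = {}, {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        totals[k] = totals.get(k, 0.0) + local_clustering(adj, node)
        counts[k] = counts.get(k, 0) + 1
    return {k: totals[k] / counts[k] for k in totals}


# Hypothetical example: a 4-clique {0,1,2,3} whose members each have one
# extra neighbor (nodes 4..7) outside the clique, so clique members have degree 4.
adj = {i: {j for j in range(4) if j != i} | {i + 4} for i in range(4)}
adj.update({i + 4: {i} for i in range(4)})
```

In this example each clique member has 6 neighbor pairs of which 3 are linked, so its local clustering is 1/2.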

The bond percolation problem on networks depends 
strongly on the structure of the underlying graph, and 
also has several important applications. The problem 
may be stated as follows: each edge of the network graph 
is visited once, and damaged (deleted) with probability 
1 — p (the quantity p is the bond occupation probability). 



The size of the giant connected component (GCC) of the graph is clearly zero for $p = 0$ but becomes nonzero at some critical value of $p > 0$: this critical value of $p$ is termed the bond percolation threshold $p_c$. The bond percolation problem has applications in epidemiology, where 
p is related to the average transmissibility of a disease and 
the GCC represents the size of an epidemic outbreak, and 
in the analysis of technological networks, where the re- 
silience of a network to the random failure of links is 
quantified by the size of the GCC [ll|. The percolation 
threshold and the GCC size may be determined analyti- 
cally for configuration model networks p^ . 
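The bond percolation procedure stated above can be sketched numerically as follows. This is a minimal illustration, assuming the graph is given as an edge list on nodes 0..n-1; the function name and the union-find bookkeeping are choices made here for illustration and are not taken from the paper.

```python
import random


def largest_component_after_percolation(n, edges, p, rng):
    """Occupy each edge with probability p and return the largest component size."""
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if rng.random() < p:  # edge survives (is occupied) with probability p
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv

    sizes = {}
    for x in range(n):
        r = find(x)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values())
```

On a finite graph the largest component is only a proxy for the GCC, which is strictly defined in the infinite-size limit; sweeping `p` and watching where the largest component fraction starts to grow gives a rough numerical estimate of the threshold.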

A number of investigations into the effects of clustering 
on bond percolation have also been undertaken. New- 
man p^ introduced a bipartite graph model of highly 
clustered networks, and examined an example of a net- 
work in which the existence of clustering decreases the 
percolation threshold from its value in an unclustered 
network, see also [l^. Serrano and Boguha [1, |Tl|, [Tsj 
make a detailed analysis of the interdependence of clustering and correlations. They distinguish between two types of clustered networks: those with average clustering $C_k$ of $k$-degree nodes less than $1/(k-1)$, termed weakly clustered, and those with $C_k > 1/(k-1)$, termed strongly clustered. The boundary $C_k = 1/(k-1)$ represents the largest value of clustering achievable without inducing degree-degree correlations in the network. Using 
approximate analytical methods for the weak clustering 
cases and numerical simulations [ll| for some strongly 
clustered networks, they compare the bond percolation 
threshold to the value it would have for an unclustered 
network with the same degree distribution. Their general 
conclusion is that weak clustering increases the percola- 
tion threshold above its unclustered value, while strong 
clustering decreases the threshold. The latter conclusion 
is consistent with the example examined by Newman [isj . 
On the other hand, it has been pointed out in the epi- 
demiological literature [l^, that in clustered networks 
infection tends to be confined within highly connected 
groups, and so sufficient clustering should increase the 
epidemic (percolation) threshold. 

Trapman [18, 19] recently introduced a model of clustering in structured graphs based on embedding cliques (complete subgraphs) within a random tree structure. He uses this model to analytically determine epidemic thresholds on networks with non-zero clustering. In Trapman's model networks, the degree-dependent clustering $C_k$ is of the form $C_k \propto (k-2)/k$ for all $k \ge 3$. In particular, $C_k$ increases with increasing degree $k$, which is contrary to the typically decreasing behavior of $C_k$ for large $k$ seen in real-world networks [20, 21]. In this paper we generalize the Trapman construction to allow for more general dependence of $C_k$ on $k$ (see equation (4) below), with a view to matching the degree-dependent clustering of real-world networks. As shown in section III, this generalization leads to clustered networks in which the bond percolation threshold may be either larger or smaller than the threshold in a randomly-wired (configuration model) network with the same degree distribution $P_k$. Furthermore, we develop methods from [22] to give analytical results for the GCC (epidemic) size on clustered networks. We also demonstrate the adaptability of these methods by calculating the sizes of $k$-cores on clustered networks. The $k$-core of a network is the largest subgraph whose nodes have degree at least $k$ [2^ . [2J]; study of $k$-core decompositions gives insights into the topology of interconnected parts of real-world networks such as the Internet [l^. Analytical results for $k$-core sizes have been found for configuration model networks [26] and on tree-like random graphs with degree-degree correlations [22], but both these cases assume zero clustering in the network. Very recently, alternative models for random graphs with clustering have been published [l^, [1^, but these examine only the bond percolation problem. 

The layout of the paper is as follows. The generalization of Trapman's algorithm for generating clustered networks is described in section II. In section III we examine the transition point for bond percolation on such clustered networks, and show that clustering may either increase or decrease the epidemic threshold. Comparisons are drawn with results using data for some real-world networks. Section IV describes an analytical approach to calculating the size of the giant connected component (the epidemic size), and the method is extended in section V to yield $k$-core sizes. Finally, conclusions are drawn in section VI. 

II. GENERATING THE CLUSTERED 
NETWORK 

Here we describe an algorithm based on that of [18, 19] 
which generates structured random networks with arbi- 
trary degree distributions and with high clustering. 
The algorithm can be written in three steps, as follows: 

(i) An uncorrelated random network is created using the configuration model in the standard way (connecting stubs at random). This network, which we call the super-graph, has a finite-variance degree distribution $\tilde{P}_k$, related to the desired distribution $P_k$ of the final network by equation (3) below. The nodes of this super-graph are called super-individuals. 

FIG. 1: (Color online) (a) Graph of super-individuals which consists of two household nodes and six bachelor nodes. (b) Graph of individuals which is generated from (a) by expanding households into $k$-cliques of individual nodes. 

(ii) A fraction $g_k$ of all $k$-degree super-individuals (for $k \ge 3$) are tagged as households. This tagging does not affect the random linking of the configuration model in any way, but is used in the next step of the algorithm. The untagged super-individuals will be referred to as bachelors. Figure 1(a) shows an example of a super-graph, with two households (drawn as larger nodes) and six bachelors. 

(iii) Taking the tagged super-graph of step (ii) as input, we generate the individuals graph, in which each node represents a single individual. Each super-individual (of degree $k$ say) which is tagged as a household is expanded into a $k$-clique of individual nodes. Thus each household in the super-graph is replaced in the individuals graph by $k$ individuals of degree $k$, all of whom are linked to each other, and each of which has one neighbor outside his own household (see Figure 1(b)). Each bachelor in the super-graph becomes an individual in the individuals graph. When all super-individuals have been replaced in this way we have generated the individuals graph with degree distribution $P_k$ and the algorithm concludes. 
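The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the super-graph degree sequence is supplied directly (rather than sampled from a distribution), the tagging probability is passed as a hypothetical callable `g`, and self-loops or multi-edges produced by random stub matching are simply kept, as is common for the configuration model on large networks.

```python
import random


def generate_individuals_graph(super_degrees, g, rng):
    """Return (edges, target_degree) for the individuals graph.

    super_degrees: degree of each super-individual (sum must be even).
    g: callable k -> probability that a k-degree super-individual is a household.
    """
    # Step (i): configuration model, pair up stubs at random.
    stubs = [s for s, k in enumerate(super_degrees) for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("sum of super-graph degrees must be even")
    rng.shuffle(stubs)
    super_edges = list(zip(stubs[0::2], stubs[1::2]))

    # Step (ii): tag households (only possible for degree k >= 3).
    household = [k >= 3 and rng.random() < g(k) for k in super_degrees]

    # Step (iii): expand each household into a k-clique of individuals.
    members, target, edges = [], [], []
    for s, k in enumerate(super_degrees):
        size = k if household[s] else 1
        ids = list(range(len(target), len(target) + size))
        target.extend([k] * size)  # every individual should end up with degree k
        members.append(ids)
        edges.extend((a, b) for i, a in enumerate(ids) for b in ids[i + 1:])

    next_slot = [0] * len(super_degrees)

    def endpoint(s):
        # A bachelor receives all of its links; each household member receives one.
        if not household[s]:
            return members[s][0]
        i = next_slot[s]
        next_slot[s] += 1
        return members[s][i]

    for a, b in super_edges:
        edges.append((endpoint(a), endpoint(b)))
    return edges, target
```

By construction each household member has $k-1$ links inside its clique plus exactly one link leaving the household, so the degree of every individual equals the degree of the super-individual it came from.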

Let $\tilde{N}$ be the total number of super-individuals in the super-graph of step (i). When $\tilde{N}$ is sufficiently large, there are approximately $\tilde{N}\tilde{P}_k$ super-individuals of degree $k$ in the network. The bachelors among these become $\tilde{N}\tilde{P}_k(1-g_k)$ individual nodes of degree $k$, while the households of degree $k$ are expanded to $\tilde{N}\tilde{P}_k g_k k$ individuals grouped into $k$-cliques. Letting $N$ denote the total number of individuals, we sum over all degree classes to obtain the relation 

\[ N = \tilde{N} \sum_k \tilde{P}_k \left( 1 - g_k + k g_k \right). \tag{1} \]

Note that taking the limit $N \to \infty$ therefore implies $\tilde{N} \to \infty$, and vice versa. 
It is convenient to introduce the fraction $f_k$ of $k$-degree nodes in the individuals graph which are members of a $k$-clique. This fraction is related to the fraction $g_k$ of $k$-degree super-individuals who were tagged as households in step (ii) of the algorithm: 

\[ g_k = \frac{f_k}{f_k + k - k f_k}, \qquad f_k = \frac{k g_k}{1 - g_k + k g_k}. \tag{2} \]

In terms of $f_k$ we have the following relation between the degree distributions $\tilde{P}_k$ and $P_k$ of the super- and individuals graphs respectively: 

\[ \tilde{P}_k = \frac{P_k \left( 1 - f_k + f_k/k \right)}{\sum_{k'=0}^{\infty} P_{k'} \left( 1 - f_{k'} + f_{k'}/k' \right)}. \tag{3} \]



Trapman's original model [18, 19] constrains the degree distribution of bachelors within the super-graph to match the distribution $P_k$ of the individuals graph. This case corresponds to choosing $f_k$ to be independent of $k$, i.e., $f_k = F$ for constant $F$. As we show in subsequent sections, many new phenomena arise when $f_k$ depends on $k$; we will refer to this case as the generalized Trapman model. 
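The conversions in equations (2) and (3) are easy to get wrong by a factor of $k$, so a short numerical sketch may help. The helper names below are hypothetical, and the degree distribution is assumed to be given as a dictionary with finite support.

```python
def g_from_f(k, f):
    """Equation (2): household fraction g_k from clique-membership fraction f_k."""
    return f / (f + k - k * f)


def f_from_g(k, g):
    """Equation (2), inverted: f_k from g_k."""
    return k * g / (1 - g + k * g)


def super_degree_distribution(P, f):
    """Equation (3): super-graph distribution from the individuals-graph one.

    P: dict k -> P_k (individuals graph); f: callable k -> f_k, with f_k = 0 for k < 3.
    """
    # For k = 0 the clique term is absent (f_0 = 0), so the weight is just P_0.
    weights = {k: Pk * (1 - f(k) + f(k) / k) if k > 0 else Pk
               for k, Pk in P.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}
```

A useful consistency check is that reweighting the super-graph distribution by the number of individuals per super-individual, $1 - g_k + k g_k$, and renormalizing recovers the original $P_k$.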

The degree-dependent clustering coefficients in the final, individuals graph may be calculated by noting that each $k$-degree individual is either a member of a single $k$-clique (with probability $f_k$) or is a member of no clique (with probability $1 - f_k$). Since each node in a $k$-clique has clustering level $(k-2)/k$ and nodes connected using the configuration model have effectively zero clustering level in the $N \to \infty$ limit (and assuming $\tilde{P}_k$ has finite variance), the final average clustering for the $k$-degree nodes in the individuals graph may be written as 

\[ C_k = \frac{f_k (k-2)}{k} \quad \text{for } k \ge 3. \tag{4} \]



Thus, given a desired degree distribution $P_k$ and degree-dependent clustering coefficients $C_k$ (for $k \ge 3$), the set of $f_k$ values may be obtained from (4), with the degree distribution $\tilde{P}_k$ and fractions $g_k$ for the super-graph of step (i) of the algorithm following from (3) and (2) respectively. Therefore this algorithm can produce structured random graphs with almost any desired level of clustering (limited only by the constraint from (4) that $C_k \le (k-2)/k$, to ensure $f_k \le 1$). Moreover, this model gives analytically tractable results for a number of dynamical processes on networks [22]. Here we shall concentrate on the bond percolation problem and the calculation of $k$-core sizes. In this context it is worth noting that our algorithm, which permits $k$-degree nodes to be members of at most one $k$-clique, can be viewed as a restricted version of Newman's bipartite graph model [13]. However, unlike Newman's model, we can specify the degree distribution $P_k$ a priori. As noted above, our model is also analytically tractable for a variety of processes beyond percolation. It must be recognized that the heavily intermittent clustering due to the $k$-cliques gives a topological structure that may be very different to a real-world network with the same $P_k$ and $C_k$; nevertheless the model can give some useful insights into the effect of clustering on GCC and $k$-core sizes in complex networks. 



III. BOND PERCOLATION THRESHOLD 
A. Calculating pc in clustered networks 

The giant connected component (GCC) of an infinite graph exists if $z_2$, the expected number of second neighbors of a random node, exceeds $z_1$, the expected number of first neighbors 0. Note both $z_1$ and $z_2$ are evaluated on the damaged graph, i.e., after a fraction $1 - p$ of the links have been deleted. The lowest value of $p$ for which $z_2/z_1 = 1$ therefore defines the bond percolation threshold $p_c$. Here we use this criterion to determine the percolation threshold (epidemic threshold) in the individuals graphs generated using the algorithm described in section II. 

Note that a giant connected component can exist in the individuals graph only if the super-graph also has a GCC. It is therefore sufficient to determine a condition for the percolation transition in the super-graph, while correctly taking account of the internal $k$-clique structure of the super-individuals which are tagged as households. 

The expected number of first neighbors in the damaged super-graph is $z_1 = p\tilde{z}$, where $\tilde{z} = \sum_k k\tilde{P}_k$ is the mean degree of the undamaged super-graph. To determine the expected number of second neighbors $z_2$ in the damaged super-graph, we first choose a super-individual at random. On average, this super-individual has $z_1$ first neighbors, with a given first neighbor being of degree $k$ with probability $k\tilde{P}_k/\tilde{z}$. If this first neighbor is a bachelor (which occurs with probability $1 - g_k$) then it connects on average to $(k-1)p$ super-individuals other than the original. If it is a household (with probability $g_k$) then the connections to the $(k-1)p$ further super-individuals may be thwarted by deleted internal links within the $k$-clique of individuals comprising the household. Thus household first neighbors connect on average to $D_k(p)$ new neighbors, where $D_k(p)$ is a polynomial in $p$ which may be determined exactly by methods used in [13] (see Appendix A), but whose values are bounded by 

\[ 0 \le D_k(p) \le (k-1)p. \tag{5} \]



Combining the cases listed above, we write the expected number of second neighbors in the damaged super-graph as 

\[ z_2 = z_1 \sum_{k=1}^{\infty} \frac{k\tilde{P}_k}{\tilde{z}} \left( (1-g_k)(k-1)p + g_k D_k(p) \right), \tag{6} \]

and so the bond percolation threshold $p_c$ is the lowest value of $p$ for which $z_2/z_1 = 1$, i.e. $p_c$ satisfies the polynomial equation 

\[ \sum_{k=1}^{\infty} \frac{k\tilde{P}_k}{\tilde{z}} \left( (1-g_k)(k-1)p_c + g_k D_k(p_c) \right) = 1. \tag{7} \]
Using equations (3) and (2) this condition may conveniently be expressed in terms of the degree distribution $P_k$ of the individuals graph, and the fraction $f_k$ of $k$-degree individuals in cliques: 

\[ \sum_{k=1}^{\infty} P_k \left( k(k-1)p_c - k + f_k \left( k - 1 - k(k-1)p_c + D_k(p_c) \right) \right) = 0. \tag{8} \]

This is a polynomial equation for the percolation threshold $p_c$, and its solution requires calculation of the $D_k(p)$ functions as specified in Appendix A. Note that if $f_k = F$, a constant for all $k$, then this reduces to the criterion determined by Trapman's [l^ equation (14). Of particular interest is the relationship between $p_c$ and the percolation threshold in unclustered (configuration model) random networks with the same degree distribution $P_k$, known to be given explicitly by [l^l 

\[ p_c^{\rm rand} = \frac{\sum_k k P_k}{\sum_k k(k-1) P_k} = \frac{\langle k \rangle}{\langle k^2 \rangle - \langle k \rangle}. \tag{9} \]



Here we have introduced the angle bracket notation to denote averaging with respect to the degree distribution $P_k$. In the remainder of this section we will examine the sign of $p_c - p_c^{\rm rand}$ to determine whether the bond percolation threshold in the clustered network is greater than, or less than, the corresponding threshold in an unclustered network with the same degree distribution. 
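For reference, the unclustered threshold of equation (9) can be evaluated directly from a degree distribution. The sketch below is illustrative only: the helper names are hypothetical and the Poisson distribution is truncated at an assumed cutoff large enough for the tail to be negligible.

```python
import math


def p_rand(P):
    """Equation (9): <k> / (<k^2> - <k>) for a distribution given as dict k -> P_k."""
    mean = sum(k * Pk for k, Pk in P.items())
    excess = sum(k * (k - 1) * Pk for k, Pk in P.items())
    return mean / excess


def poisson(z, kmax=200):
    """Poisson degree distribution with mean z, truncated at kmax (assumed cutoff)."""
    P, term = {}, math.exp(-z)
    for k in range(kmax + 1):
        P[k] = term
        term *= z / (k + 1)  # recurrence P_{k+1} = P_k * z / (k + 1)
    return P
```

For a Poisson distribution $\langle k^2 \rangle - \langle k \rangle = z^2$, so this reproduces the familiar result $p_c^{\rm rand} = 1/z$.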



B. Examples 

Figure 2 shows the bond percolation threshold $p_c$ calculated from equation (8) for networks with a Poisson degree distribution $P_k = z^k e^{-z}/k!$. The log-log plots show $p_c$ as a function of the mean degree $z = \langle k \rangle$, and for clique fractions $f_k$ of the form 

\[ f_k = \left( \frac{2}{k-1} \right)^{\beta} \quad \text{for } k \ge 3, \tag{10} \]

with $f_k = 0$ for $k < 3$ (since $k$-cliques only exist for $k \ge 3$). We show results for values of $\beta$ ranging from $\beta = 0$ (giving $f_k = 1$ for all relevant $k$) to $\beta = 2$ as described in the caption. Also shown (as a thick black curve) is the percolation threshold $p_c^{\rm rand} = 1/z$ in the corresponding unclustered network. For all values of $\beta$ greater than zero, we find $p_c > p_c^{\rm rand}$ for small values of the mean degree $z$, but for sufficiently large $z$ the clustered percolation point $p_c$ becomes slightly less than the configuration model value $p_c^{\rm rand}$. Figure 2(b) highlights this clustering-induced decrease of the threshold value by showing that the ratio $p_c/p_c^{\rm rand}$ is (slightly) less than unity for the larger $z$ values shown. 

Figure 3 shows $p_c$ values for the truncated power-law degree distribution 

\[ P_k = \begin{cases} A k^{-\gamma}, & 3 \le k \le k_{\max} \\ 0, & \text{otherwise} \end{cases} \tag{11} \]
FIG. 2: (Color online) (a) Bond percolation threshold $p_c$ in clustered Poisson random graphs with mean degree $z$. The fraction $f_k$ of individuals of degree $k$ which are members of households ($k$-cliques) is $f_k = (2/(k-1))^{\beta}$ with $\beta$ taking values indicated in the legend. The thick black curve shows the percolation threshold $p_c^{\rm rand}$ in the unclustered ($f_k = 0$) case. (b) The ratio $p_c/p_c^{\rm rand}$ highlights the decrease in the percolation threshold due to clustering when $\beta$ is 1 or 2. 

FIG. 3: (Color online) Bond percolation threshold $p_c$ for clustered networks with degree distribution $P_k \propto k^{-\gamma}$ and cutoff degree $k_{\max}$. The fraction $f_k$ of individuals of degree $k$ which are members of households ($k$-cliques) is $f_k = (2/(k-1))^{\beta}$ with $\beta$ taking the values from 0 to 2. The thick black curve shows the percolation threshold $p_c^{\rm rand}$ in the unclustered ($f_k = 0$) case. 
for $\gamma = 2.5$ and with the normalization constant $A$ chosen so that $\sum_k P_k = 1$. The dependence of $f_k$ on $k$ is as in Figure 2. For convenience we have taken $P_k = 0$ for $k \le 2$; this choice ensures the undamaged graph is relatively well-connected, and in particular that for larger values of $\gamma$ a GCC exists in the unclustered network [11]. Note that here the results are presented as functions of the cutoff degree $k_{\max}$ in order to highlight interesting behavior in the $k_{\max} \to \infty$ limit of scale-free networks. The results for the power-law degree distribution are qualitatively similar to those for the Poisson degree distribution, i.e., in all instances, except the $\beta = 0$ case of constant $f_k$, the clustered networks show a decrease of $p_c$ with increasing $k_{\max}$. At large values of $k_{\max}$ we see $p_c$ dipping below $p_c^{\rm rand}$ to a greater extent in Figure 3 than for the Poisson degree distribution in Figure 2. In the $\beta = 0$ case of constant $f_k$ the clustered threshold $p_c$ always exceeds $p_c^{\rm rand}$; some implications of this are considered in section III D below. 



C. Analytical bounds 

Some insight into these results may be gained by examining explicit bounds for $p_c$ which may be obtained analytically from equation (5). Since $D_k(p)$ is a monotone function of $p$, by replacing $D_k(p)$ with its respective bounds from (5), we can solve (8) for lower and upper bounds $p_-$ and $p_+$ on the value of $p_c$. Thus we obtain $p_- \le p_c \le p_+$, with 

\[ p_- = \frac{\langle k(1-f_k) + f_k \rangle}{\langle (k-1)\left( k(1-f_k) + f_k \right) \rangle}, \qquad p_+ = \frac{\langle k(1-f_k) + f_k \rangle}{\langle k(k-1)(1-f_k) \rangle}. \tag{12} \]

Note that $p_-$ and $p_+$ both reduce to $p_c^{\rm rand}$ when $f_k = 0$. We now examine the quantities $p_- - p_c^{\rm rand}$ and $p_+ - p_c^{\rm rand}$ for some specific forms of the clique fractions $f_k$. Of particular interest are cases where $p_- - p_c^{\rm rand}$ can be shown to be positive, or where $p_+ - p_c^{\rm rand}$ is negative. In the former case we obtain $p_c \ge p_- > p_c^{\rm rand}$, and so can guarantee that the presence of such clustering increases the percolation threshold above $p_c^{\rm rand}$; in the latter case we similarly guarantee that $p_c < p_c^{\rm rand}$. After a little manipulation, we obtain the expressions 

\[ p_- - p_c^{\rm rand} = \frac{\langle k \rangle \langle k(k-1) f_k \rangle - \langle k^2 \rangle \langle (k-1) f_k \rangle}{\langle k(k-1) \rangle \, \langle (k-1)\left( k(1-f_k) + f_k \right) \rangle}, \tag{13} \]

\[ p_+ - p_c^{\rm rand} = \frac{\langle k \rangle \langle (k^2-1) f_k \rangle - \langle k^2 \rangle \langle (k-1) f_k \rangle}{\langle k(k-1) \rangle \, \langle k(k-1)(1-f_k) \rangle}. \tag{14} \]

As the denominators are manifestly positive, the signs of these expressions are determined by the signs of their respective numerators. 

1. Clustering increases the percolation threshold when $f_k$ is constant 

We first examine $p_- - p_c^{\rm rand}$ in the case where $f_k = F$, a constant, for all $k \ge 3$. The numerator of (13) then simplifies to 

\[ F \left( \langle k^2 \rangle - \langle k \rangle^2 + \langle k^2 \rangle (P_2 - P_0) - 2 \langle k \rangle P_2 \right). \tag{15} \]

For the power-law degree distribution (11) we have $P_k = 0$ for $k < 3$, and so this expression reduces to $F\,{\rm var}(k)$, where ${\rm var}(k)$ is the variance $\langle k^2 \rangle - \langle k \rangle^2$ of the degree distribution. Since this is positive for any $k_{\max} > 3$, we have proven that $p_c > p_c^{\rm rand}$ for constant $f_k$ in this case. Similarly, it can be shown that (15) is positive, and hence $p_c > p_c^{\rm rand}$, for the Poisson degree distribution. These results are consistent with the $\beta = 0$ results for $p_c$ (thin black lines) in Figures 2 and 3, which never dip below the $p_c^{\rm rand}$ values (thick black line). 

2. Clustering decreases the percolation threshold if $f_k = F/(k-1)$ 

Next, we consider the numerator of $p_+ - p_c^{\rm rand}$ for $f_k$ of the form $F/(k-1)$ for $k \ge 3$, with $F$ in the range $0 < F \le 2$. The numerator of (14) then simplifies to 

\[ F \left( \langle k \rangle^2 + \langle k \rangle - \langle k^2 \rangle - \langle k \rangle (P_0 + 2P_1 + 3P_2) + \langle k^2 \rangle (P_0 + P_1 + P_2) \right). \tag{16} \]

For the power-law degree distribution (11) this further reduces to $F\left( \langle k \rangle - {\rm var}(k) \right)$, and as $k_{\max} \to \infty$ this certainly becomes negative. Specifically, for the exponent $\gamma = 2.5$ used in Figure 3 this bound guarantees that $p_c$ is less than $p_c^{\rm rand}$ for $k_{\max} \ge 13$. This is consistent with the curve for $\beta = 1$ in Figure 3. For the Poisson distribution, the numerator simplifies to $F z^4 e^{-z}/2$; however, as this quantity is positive we cannot draw any strong conclusions for this case. 
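The analytical bounds of this section can be checked numerically. The following sketch evaluates the lower and upper bounds of equation (12) and the unclustered threshold of equation (9) for the truncated power-law distribution of equation (11). All function names are illustrative assumptions; the averages are plain sums over the finite support.

```python
def truncated_power_law(gamma, kmax, kmin=3):
    """Equation (11): P_k = A k^(-gamma) for kmin <= k <= kmax, zero otherwise."""
    w = {k: k ** -gamma for k in range(kmin, kmax + 1)}
    A = 1.0 / sum(w.values())
    return {k: A * v for k, v in w.items()}


def avg(P, h):
    """Average of h(k) with respect to the degree distribution P."""
    return sum(h(k) * Pk for k, Pk in P.items())


def p_rand(P):
    return avg(P, lambda k: k) / avg(P, lambda k: k * (k - 1))


def p_minus(P, f):
    """Lower bound of equation (12), obtained with D_k(p) = (k-1)p."""
    num = avg(P, lambda k: k * (1 - f(k)) + f(k))
    den = avg(P, lambda k: (k - 1) * (k * (1 - f(k)) + f(k)))
    return num / den


def p_plus(P, f):
    """Upper bound of equation (12), obtained with D_k(p) = 0."""
    num = avg(P, lambda k: k * (1 - f(k)) + f(k))
    den = avg(P, lambda k: k * (k - 1) * (1 - f(k)))
    return num / den


def inverse_degree_fraction(F):
    """Clique fraction f_k = F/(k-1) for k >= 3, zero otherwise (needs F <= 2)."""
    return lambda k: F / (k - 1) if k >= 3 else 0.0


# Example with gamma = 2.5 and F = 1: compare the upper bound with p_rand.
f = inverse_degree_fraction(1.0)
for kmax in (12, 13):
    P = truncated_power_law(2.5, kmax)
    print(kmax, p_plus(P, f) < p_rand(P))
```

With these choices the upper bound should drop below the unclustered threshold between $k_{\max} = 12$ and $k_{\max} = 13$, in line with the sign change of $\langle k \rangle - {\rm var}(k)$ discussed above.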

D. Scale-free networks 

A scale-free network (SFN) has degree distribution $P_k \propto k^{-\gamma}$ with $2 < \gamma < 3$ for sufficiently large $k$. Networks with such degree distributions may be generated by taking the limit $k_{\max} \to \infty$ of the truncated power-law networks introduced in equation (11). Of particular interest is the bond percolation threshold $p_c$, which is known [12, 29, 30, 31, 32, 33, 34] to be zero for randomly wired (uncorrelated) SFNs. This can be seen from equation (9): the second moment $\sum_k k^2 P_k$ for SFNs is infinite while the mean degree $z$ is finite, and so the denominator of the expression for $p_c^{\rm rand}$ grows without bound as the cutoff $k_{\max}$ is increased, giving the result $p_c^{\rm rand} \to 0$ as $k_{\max} \to \infty$. Correlated (assortative) tree-like networks with scale-free degree distributions also have vanishing percolation threshold, and [14] and [15] hypothesize that clustering cannot cause the percolation threshold to be non-zero. However, Trapman [l^ has applied his clustering model to note that if $f_k = 1$ for all $k$ then a non-zero bond percolation threshold is established even in scale-free networks. To see this result, it is convenient to express the lower bound $p_-$ for the percolation threshold given in equation (12) in terms of the degree distribution $\tilde{P}_k$ of the super-graph, using equation (3): 

\[ p_- = \frac{\sum_k k \tilde{P}_k}{\sum_k k(k-1) \tilde{P}_k}. \tag{17} \]

This implies that the lower bound $p_-$ for the percolation threshold in the individuals graph is equal to the percolation threshold in the randomly-wired super-graph. In other words, the individuals graph can only possess a GCC if the super-graph also has a GCC. Now consider Trapman's example of a SFN where all $f_k$ are equal to one, with degree distribution (11) and in the limit $k_{\max} \to \infty$. The super-graph degree distribution is then $\tilde{P}_k \propto P_k/k \propto k^{-\gamma-1}$. This degree distribution has finite variance for $\gamma > 2$, and so it follows that the right hand side of (17) is non-zero. In fact, we can explicitly evaluate $p_-$ to obtain the following bound on the percolation threshold: 

\[ p_c \ge \frac{1}{A \sum_{k=3}^{\infty} (k-1) k^{-\gamma}} = \frac{1}{z-1}, \tag{18} \]

where $z$ is the (finite) mean degree of the individuals scale-free network, $z = \sum_k k P_k$. 
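The identity between the super-graph threshold of equation (17) and the closed form $1/(z-1)$ of equation (18) can be verified numerically for the case where every individual belongs to a clique. The sketch below assumes a large but finite cutoff in place of the infinite limit; the function name is hypothetical.

```python
def sfn_lower_bound(gamma, kmax, kmin=3):
    """Return (p_minus from eq. (17), 1/(z-1) from eq. (18)) when f_k = 1 for all k."""
    w = {k: k ** -gamma for k in range(kmin, kmax + 1)}
    A = 1.0 / sum(w.values())
    P = {k: A * v for k, v in w.items()}        # individuals graph, eq. (11)

    # With f_k = 1, equation (3) gives a super-graph distribution proportional to P_k / k.
    tw = {k: Pk / k for k, Pk in P.items()}
    tot = sum(tw.values())
    Pt = {k: v / tot for k, v in tw.items()}

    p_minus = (sum(k * v for k, v in Pt.items())
               / sum(k * (k - 1) * v for k, v in Pt.items()))
    z = sum(k * Pk for k, Pk in P.items())      # mean degree of individuals graph
    return p_minus, 1.0 / (z - 1.0)
```

Increasing the cutoff by orders of magnitude changes the bound only slightly, illustrating that it stays bounded away from zero while the unclustered threshold of equation (9) vanishes.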

It is worth pointing out that the mechanism described here for generating a non-zero percolation threshold in SFNs is distinctly different from those previously examined for tree-like correlated networks [sHi, 2D lattice-embedded networks [36], and for clustered growing networks [Isi. All of these examples are disassortative networks, i.e., the average degree of neighbors of $k$-degree nodes $\langle k \rangle_{nn}$ is a decreasing function of $k$ (with an asymptotic constant value as $k \to \infty$ in the case of [13 ). By contrast, the individuals graph generated by Trapman's model with $f_k = F = 1$ is strongly assortative, since high-degree nodes link almost exclusively to nodes of the same degree. Indeed, we show in Appendix B that the joint pdf $P(k,j)$ of degrees of vertices at either end of a randomly chosen edge in the individuals graph is 

\[ P(k,j) = \frac{P_j}{z} \left( P_k + (j-1)\delta_{kj} \right). \tag{19} \]

Hence the average degree of neighbors of nodes with degree $k$ is $\langle k \rangle_{nn} = k - 1 + z/k$ and so increases linearly for large $k$. 
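The step from the joint distribution of equation (19) to the average neighbor degree can be checked with a few lines of Python. The helper names below are hypothetical, and the degree distribution is assumed to be normalized and supplied as a dictionary.

```python
def joint_degree_pdf(P):
    """Equation (19): P(k, j) = (P_j / z) * (P_k + (j - 1) * delta_kj)."""
    z = sum(k * Pk for k, Pk in P.items())
    J = {(k, j): (P[j] / z) * (P[k] + (j - 1) * (1 if k == j else 0))
         for k in P for j in P}
    return J, z


def mean_neighbor_degree(P, k):
    """Average degree of the neighbors of a degree-k node, from the joint pdf."""
    J, _ = joint_degree_pdf(P)
    row = sum(J[(k, j)] for j in P)              # equals k * P_k / z
    return sum(j * J[(k, j)] for j in P) / row
```

Summing $j\,P(k,j)$ over $j$ and dividing by the marginal $k P_k / z$ gives $(z + k(k-1))/k$, which is the expression $k - 1 + z/k$ quoted in the text.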

We also highlight the fact that the non-zero perco- 
lation threshold arising in the F = 1 Trapman model 
Network                        p_-      p_+      p_c      p_c^rand
Power Grid [8, 38]             0.3580   0.3739   0.3645   0.3483
AS Internet [39]               0.0031   0.0031   0.0031   0.0035
Collaborations [40, 41]        0.0273   0.0279   0.0279   0.0380
World Wide Web [42, 43]        0.0020   0.0020   0.0020   0.0036
Router-Level Internet [44]     0.0244   0.0245   0.0245   0.0271
PGP Network [45, 46, 47]       0.0545   0.0567   0.0561   0.0559

TABLE I: (Color online) Values of the bounds $p_-$ and $p_+$ (using equation (12)), the bond percolation threshold $p_c$ in the clustered model network (using equation (8)), and the randomly-wired percolation threshold $p_c^{\rm rand}$ (using equation (9)) for some real-world networks. The bounds are calculated using the degree distribution $P_k$ and clustering $C_k$ (for $k > 2$) of the real-world network, converting clustering to $k$-clique fractions via equation (20). 



is due to the clustering, and not just a result of the degree-correlations induced by the clique structure. Indeed, consider a correlated but unclustered (tree-like) network with degree-degree correlations equal to those given by (19), and reintroduce the cutoff $k_{\max}$ for the SFN degree distribution. The percolation threshold for such unclustered networks is known [sl, [s^ to be given by the reciprocal of the largest eigenvalue of the matrix $\mathbf{C}$ with entries $C_{kj} = (j-1) z P(k,j)/(k P_k)$. In the present case this threshold scales as $k_{\max}^{-1}$ as $k_{\max} \to \infty$, and so the correlated tree-like network has a vanishing percolation threshold. This is consistent with the behavior of strongly assortative tree-like networks studied in [ssj, and shows that the finite threshold given by (18) is directly attributable to the non-zero clustering in the Trapman model. Criteria for the existence of a finite SFN percolation threshold for non-constant $f_k$ will be reported elsewhere. 



E. Real-world networks 

In Table I we show the results of applying our model of clustering to some real-world networks. Given the degree distribution $P_k$ and the degree-dependent clustering $C_k$ of a real-world network, we choose $f_k$ values using equation (4) so that the model network has a $k$-clique structure which matches to $P_k$ and (for all $k \ge 3$, and provided $C_k$ is not too large) to $C_k$: 

\[ f_k = \min\left( 1, \frac{k C_k}{k-2} \right) \quad \text{for } k \ge 3. \tag{20} \]
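Equation (20) is the inversion of equation (4) with a cap at one. A minimal sketch of the conversion, with hypothetical function names, is:

```python
def clique_fraction(k, Ck):
    """Equation (20): f_k = min(1, k*C_k/(k-2)) for k >= 3; no cliques for k < 3."""
    if k < 3:
        return 0.0
    return min(1.0, k * Ck / (k - 2))


def model_clustering(k, fk):
    """Equation (4): clustering C_k = f_k*(k-2)/k produced by the model for k >= 3."""
    return fk * (k - 2) / k if k >= 3 else 0.0
```

When the measured clustering exceeds $(k-2)/k$ the cap applies and the model clustering falls short of the target, which is the sense in which the match holds only if $C_k$ is not too large.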



Using equations (12) and (9) we calculate the bounds $p_-$ and $p_+$ for the percolation threshold in the clustered model network, as well as the threshold $p_c^{\rm rand}$ for the corresponding randomly-wired graph. In most cases (the PGP network being the exception) we can immediately see from the bounding values $p_-$ and $p_+$ whether the clustered percolation threshold $p_c$ will exceed $p_c^{\rm rand}$ or not. For the power grid network we have $p_- > p_c^{\rm rand}$ and so conclude that clustering increases the percolation threshold. For PGP the bounds are inconclusive, but calculation using equation (8) confirms $p_c > p_c^{\rm rand}$ in this case also. For all other networks studied we find $p_+ < p_c^{\rm rand}$, so that clustering decreases the percolation threshold. 

We obtain these results on $p_c$ under the assumption that the generalized Trapman model can describe the structure of real-world networks by matching the degree distribution and degree-dependent clustering. This is admittedly a rather strong assumption, and further verification is needed before these results can be considered more than just some interesting examples of applying the model. As percolation thresholds are defined only in the infinite system size limit $N \to \infty$, it is not possible to directly calculate percolation thresholds for (necessarily finite) real-world networks, but it appears from Figure 7 of [ll| that the PGP network percolates for $p < 0.05$, whereas the $p_c$ value we predict in Table I is substantially larger. We conclude that the Trapman model is not necessarily a good predictor of the percolation properties of real-world networks, despite its ability to match the degree distribution and the clustering of the network. 

In summary, in this section we have derived the polynomial equation (8) for the percolation threshold $p_c$ in the presence of clustering, and solved it numerically for some examples. Analytical bounds on the value of $p_c$ have also been derived, and for the truncated power-law degree distribution clique fractions of the form $f_k = F$ and $f_k = F/(k-1)$ have been respectively shown to guarantee that $p_c$ is greater than, or less than, the unclustered threshold value $p_c^{\rm rand}$. Application of the model to some real-world networks yields examples where $p_c < p_c^{\rm rand}$ in some cases, with $p_c > p_c^{\rm rand}$ in others. In scale-free networks clustering with $f_k = 1$ guarantees a finite percolation threshold, in contrast to corresponding tree-like networks (even those with the same degree correlations) where the percolation threshold vanishes. 



IV. CALCULATING GCC SIZES 

In this section we develop an analytical approach to 
calculating the size of the giant connected component 
in the damaged individuals graph with bond occupation 
probability p. In an epidemiological context, the GCC 
size corresponds to the expected size of epidemic out- 
breaks in the population. Of particular interest is the 
effect of clustering on the epidemic size. 

Our method is based on a general formulation for cascade sizes on random networks, described
in detail in [22]. We note that a generating function approach could also be used here,
similar to [13], and such a method could yield the full distribution of connected component
sizes. However, our method has the advantage of being readily generalizable to the study of
other cascade-type problems on networks, as we show in section V by using it to calculate
the size of $k$-cores in the clustered networks. The method is a generalization of the
approach of Dhar et al. for the zero-temperature random-field Ising model on a network and
has been successfully applied to cascade dynamics in various models [i^ [Ho] , including the
calculation of $k$-core sizes in correlated (but unclustered) networks [22].




FIG. 4: (Color online) Schematic showing parts of a tree approximation for a
super-individuals graph which is expanded to an individuals graph. Level n is occupied by a
bachelor (left) and by a top node of the expanded household (right). Other members of the
same household are located at an intermediate level.

Following the approach of [22], we approximate the randomly wired super-graph as a tree
structure. This tree ansatz is commonly used for the configuration model; it assumes the
absence of finite loops in the super-graph in the $N \to \infty$ limit and allows only the
infinite loops whose presence permits the use of mean-field theory [2]. Figure 4 shows part
of such a structure, with the super-individuals now expanded to show the individual nodes
which constitute households. We label the levels of the tree as shown, with each
super-individual at level $n$ having a single parent at level $n+1$. Degree-$k$ bachelors at
level $n$ therefore have $k-1$ children at level $n-1$; degree-$k$ households at level $n$
are considered to consist of a top individual (shown at level $n$), with the $k-1$ other
individuals of the household drawn at an intermediate level. Each of these $k-1$ individuals
has one child super-individual at level $n-1$.

The cascade-based approach to calculating the expected size of the giant connected component
is as follows. Having chosen a value for the bond occupation probability $p$ we damage the
individuals graph by deleting each link between individuals with probability $1-p$. We label
nodes which are part of a connected component of the graph as active, with the remaining
nodes termed inactive. A random individual is selected as the top (i.e. root) of a tree,
with his first neighbors on the next lower level, their neighbors at the following lower
level, and so on. To determine the steady-state fraction of active nodes in the network, we
must determine the probability that the individual at the top of the tree is active. Note
that all nodes in the tree are initially inactive, and that once a node is activated it
cannot later become inactive. Starting at level 0 (the bottom of the tree), we examine the
propagation of activity from level $n$ to level $n+1$, proceeding one level at a time and
using the fact that nodes at level $n+1$ are inactive until their children cause them to
become active.

Define $q_n$ as the probability that a super-individual at level $n$ is active [52i].
Similar probabilities may be defined separately for households and for bachelors; moreover
we distinguish between super-individuals of different degree $k$. Denote by $b_n^{(k)}$ the
probability that a bachelor node of degree $k$ at level $n$ is active, and by $h_n^{(k)}$
the probability that the top individual node in a household of degree $k$ is active. Since a
randomly-chosen super-individual connects to a super-individual of degree $k$ with
probability $k\tilde{P}_k/\tilde{z}$, we have the relation

$$ q_n = \sum_{k=1}^{\infty} \frac{k}{\tilde{z}} \tilde{P}_k \left[ (1-g_k)\, b_n^{(k)} + g_k\, h_n^{(k)} \right]. \tag{21} $$

To determine $b_{n+1}^{(k)}$ and $h_{n+1}^{(k)}$ in terms of $q_n$ we consider how the
property of being active (i.e. being a member of a connected component) propagates from
level to level. As we move focus from level to level, we need only consider the active
fraction at level $n$ to determine how many nodes at level $n+1$ change from their inactive
initial state. For bachelor nodes of degree $k$, we need consider only their $k-1$ children
at level $n$. Each of the children is part of a connected component with probability $q_n$
and the link to this child is undamaged with probability $p$. The bachelor node becomes
active if any one of the $k-1$ links to level $n$ yields an undamaged connection to an
active child, thus we have the update rule [l^l

$$ b_{n+1}^{(k)} = 1 - (1 - p q_n)^{k-1}. \tag{22} $$



For households at level $n+1$ we consider the situation of the top individual. Within the
$k$ individual nodes of the household, the top individual is part of a connected cluster of
$m$ individuals with probability $P(m|k)$ (see Appendix A). Each of the $m-1$ other
individuals within the household has one edge linking to level $n$, and so the probability
that at least one of these will become active is $1 - (1 - p q_n)^{m-1}$. Summing over the
possible values of $m$, we obtain the probability of the top node of the household becoming
active:

$$ h_{n+1}^{(k)} = \sum_{m=1}^{k} P(m|k) \left( 1 - (1 - p q_n)^{m-1} \right). \tag{23} $$

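Combining (21), (22), and (23) enables us to write a single update equation for $q_n$ of the
form $q_{n+1} = G(q_n)$ with

$$ G(q) = \sum_{k=1}^{\infty} \frac{k}{\tilde{z}} \tilde{P}_k \left[ (1-g_k)\left( 1 - (1-pq)^{k-1} \right) + g_k \sum_{m=1}^{k} P(m|k) \left( 1 - (1-pq)^{m-1} \right) \right]. \tag{24} $$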

Starting from an infinitesimally small positive value (e.g., $q_0 = 1/N$ as $N \to \infty$),
this equation is iterated to yield the steady-state solution $q_\infty$ corresponding to an
infinite network. Finally, we consider the individual at the top (or root) of this infinite
tree. Suppose the individual has degree $k$ (this happens with probability $P_k$) and so has
$k$ children. With probability $1 - f_k$ it is an individual who was a bachelor in the
super-graph, and so is activated by its children with probability $1 - (1 - p q_\infty)^k$.
Otherwise it is a member of a household of size $k$, and so is part of a connected cluster
of $m$ individuals within this household with probability $P(m|k)$. The whole cluster
becomes active if any member of it has an undamaged link to an active child; this happens
with probability $1 - (1 - p q_\infty)^m$. Putting together all the possibilities, we obtain
an expression for $S$, the expected size of the giant connected component:

$$ S = \sum_{k=0}^{\infty} P_k \left[ (1-f_k)\left( 1 - (1 - p q_\infty)^k \right) + f_k \sum_{m=1}^{k} P(m|k) \left( 1 - (1 - p q_\infty)^m \right) \right], \tag{25} $$

where $q_\infty$ is the steady state of the iteration $q_{n+1} = G(q_n)$ defined by equation
(24). Indeed, the iteration process with infinitesimal $q_0$ can be seen as a solution
method for the self-consistent equation $q_\infty = G(q_\infty)$.
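The iteration of equation (24) followed by the evaluation of equation (25) can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the function names `clique_cluster_probs` and `gcc_size`, the starting value `q0 = 1e-9`, the convergence tolerance, and the truncation of the degree distribution at the length of the input list are all assumptions made here. The conversion from super-graph quantities (the list `P_super` and household probabilities `g`) to the individuals-graph quantities $P_k$ and $f_k$ is also an assumption, based on each degree-$k$ household expanding into $k$ individuals of degree $k$.

```python
from math import comb, exp, factorial


def clique_cluster_probs(k, p):
    """Return [P(1|k), ..., P(k|k)] using equations (39) and (40) of Appendix A."""
    full = {}  # full[j] stores P(j|j)
    row = []
    for size in range(1, k + 1):
        row = [comb(size - 1, m - 1) * (1 - p) ** (m * (size - m)) * full[m]
               for m in range(1, size)]
        full[size] = 1.0 - sum(row)
        row.append(full[size])
    return row


def gcc_size(P_super, g, p, q0=1e-9, tol=1e-12, max_iter=100000):
    """Iterate q_{n+1} = G(q_n) (equation (24)), then evaluate S (equation (25)).

    P_super[k]: super-graph degree distribution (hypothetical input format).
    g[k]: probability that a degree-k super-individual is a household; g[0] must be 0.
    """
    kmax = len(P_super) - 1
    z_super = sum(k * P_super[k] for k in range(kmax + 1))
    Pm = {k: clique_cluster_probs(k, p) for k in range(1, kmax + 1)}

    def active_prob(k, q, shift):
        # shift = 1 inside the tree (one link leads to the parent), 0 at the root.
        x = 1.0 - p * q
        bachelor = 1.0 - x ** (k - shift)
        household = sum(Pm[k][m - 1] * (1.0 - x ** (m - shift))
                        for m in range(1, k + 1))
        return bachelor, household

    def G(q):
        total = 0.0
        for k in range(1, kmax + 1):
            b, h = active_prob(k, q, 1)
            total += k * P_super[k] / z_super * ((1 - g[k]) * b + g[k] * h)
        return total

    q = q0
    for _ in range(max_iter):
        q_next = G(q)
        converged = abs(q_next - q) < tol
        q = q_next
        if converged:
            break

    # Assumed conversion: a degree-k household contributes k individuals of degree k,
    # a bachelor contributes a single individual.
    weight = [P_super[k] * (1 - g[k] + k * g[k]) for k in range(kmax + 1)]
    norm = sum(weight)
    S = 0.0
    for k in range(1, kmax + 1):
        f_k = k * g[k] / (1 - g[k] + k * g[k])
        b, h = active_prob(k, q, 0)
        S += weight[k] / norm * ((1 - f_k) * b + f_k * h)
    return S, q
```

For an unclustered Poisson network (all `g[k] = 0`) with mean degree 3 and `p = 1`, the returned `S` should satisfy the classical relation $S = 1 - e^{-3S}$ up to the truncation error of the degree distribution.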

Classical results on uncorrelated, unclustered networks are recovered by setting
$f_k = g_k = 0$ for all $k$ in equations (24) and (25): this reduction (via the notation
mapping $1 - pq \mapsto x$) recovers, for example, equations (9) and (14) of [23].

Note that a general cascade condition [22] for this system requires

$$ \frac{dG}{dq} > 1 \quad \text{at } q = 0, \tag{26} $$

in order that the initial iterations of the relation $q_{n+1} = G(q_n)$ allow $q_n$ to grow
finitely large. The lowest value of $p$ for which this condition holds defines the bond
percolation threshold $p_c$, and it is easy to check that this condition reduces to equation
(7), which was derived using more traditional arguments in section III.
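As a hedged illustration of the cascade condition (26), the sketch below differentiates $G$ of equation (24) at $q = 0$, which gives $\sum_k (k\tilde{P}_k/\tilde{z})\,[(1-g_k)(k-1)p + g_k D_k(p)]$ with $D_k(p)$ from Appendix A, and locates the smallest $p$ at which this slope exceeds 1. The bisection search, the function names, and the assumption that the slope increases monotonically with $p$ are choices made for this example only.

```python
from math import comb, exp, factorial


def clique_cluster_probs(k, p):
    """Return [P(1|k), ..., P(k|k)] using equations (39) and (40) of Appendix A."""
    full = {}  # full[j] stores P(j|j)
    row = []
    for size in range(1, k + 1):
        row = [comb(size - 1, m - 1) * (1 - p) ** (m * (size - m)) * full[m]
               for m in range(1, size)]
        full[size] = 1.0 - sum(row)
        row.append(full[size])
    return row


def D(k, p):
    """Equation (41): mean number of undamaged external links reached via the household."""
    probs = clique_cluster_probs(k, p)
    return sum((m - 1) * p * probs[m - 1] for m in range(1, k + 1))


def cascade_slope(P_super, g, p):
    """dG/dq evaluated at q = 0 for G defined in equation (24)."""
    kmax = len(P_super) - 1
    z_super = sum(k * P_super[k] for k in range(kmax + 1))
    return sum(k * P_super[k] / z_super
               * ((1 - g[k]) * (k - 1) * p + g[k] * D(k, p))
               for k in range(1, kmax + 1))


def percolation_threshold(P_super, g, tol=1e-10):
    """Smallest p with slope > 1, found by bisection (assumes a monotonic slope).

    Returns None when the condition is not met even at p = 1.
    """
    if cascade_slope(P_super, g, 1.0) <= 1.0:
        return None
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cascade_slope(P_super, g, mid) > 1.0:
            hi = mid
        else:
            lo = mid
    return hi
```

In the unclustered Poisson case with mean degree 3 the slope is $3p$ (up to truncation), so the sketch should return a value close to $1/3$.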

Figures 5(a) and 5(b) show a comparison between the analytical solution (curves) and
numerical computation of GCC sizes in networks generated using the algorithm of section II
with N = 10^ individuals (symbols) [ssj . The degree distributions of the networks of Figure
5(a) are Poisson (as in Figure 2) with mean degree $z = 3$, while the networks for Figure
5(b) have a truncated power-law degree distribution (11) with $k_{\max} = 30$ (cf. Figure 3).
Note that the values of the percolation threshold $p_c$ predicted in Figures 2 and 3
correspond to the $p$ values where the GCC size becomes non-zero. For the Poisson case both
cases with clustering have $p_c$ larger than the unclustered value $p_c^{\rm rand}$, while
the power-law case of Figure 5(b) shows that $p_c$ may be larger or smaller than the
unclustered value, depending on the form of the $k$-clique fraction $f_k$. The agreement
between theory and numerical results is excellent.

FIG. 5: (Color online) Size of giant connected component $S$ as a function of bond
occupation probability in (a) clustered Poisson random graphs with mean degree $z = 3$, and
in (b) clustered graphs with truncated power-law degree distribution $P_k \propto$ k~^'^ for
$3 \le k \le k_{\max}$, with $k_{\max} = 30$ here. Symbols are the results of numerical
simulations on a single network with N = 10^ individuals (averaged over 10 realizations of
the percolation process), and curves show the analytical result from equations (24) and
(25). The fraction of individuals of degree $k$ which are members of households
($k$-cliques) is $f_k = (2/(k-1))^{\beta}$ with $\beta$ taking values indicated in the
legend. The unclustered case ($f_k = 0$) is also shown for comparison.
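The numerical side of such a comparison can be sketched as below: each edge of a given network is kept with probability $p$, and the largest connected cluster is measured with a union-find structure, averaged over several realizations. This is only an assumed illustration of the measurement step. The network-generation algorithm of section II is not reproduced, and the function names, the edge-list input format, and the fixed random seed are hypothetical choices.

```python
import random


def largest_component_fraction(n_nodes, edges, p, rng):
    """Occupy each edge with probability p; return largest cluster size / n_nodes."""
    parent = list(range(n_nodes))
    size = [1] * n_nodes

    def find(x):
        # Follow parent pointers to the cluster label, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        if rng.random() < p:  # this edge survives the damage
            ra, rb = find(a), find(b)
            if ra != rb:
                if size[ra] < size[rb]:
                    ra, rb = rb, ra
                parent[rb] = ra
                size[ra] += size[rb]
    return max(size[find(v)] for v in range(n_nodes)) / n_nodes


def mean_gcc_fraction(n_nodes, edges, p, realizations=10, seed=0):
    """Average the largest-cluster fraction over independent percolation realizations."""
    rng = random.Random(seed)
    total = sum(largest_component_fraction(n_nodes, edges, p, rng)
                for _ in range(realizations))
    return total / realizations
```

On a finite network the largest-cluster fraction is only an estimate of the giant component size, which is why the threshold itself cannot be read off directly from such simulations.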



V. CALCULATING K-CORE SIZES 

The $k$-core of a network is the largest subgraph whose nodes have degree at least $k$. As
discussed in pa.l2l.l26t. the size of the $k$-core may be calculated as the steady state of
a cascade process. We consider the nodes of the individuals network to have two possible
states, labelled pruned and unpruned, and begin with all nodes in the unpruned state. In the
first step of the cascade process for calculating the $k$-core for $k = K$, all nodes with
fewer than $K$ neighbors are relabelled as pruned; these nodes cannot be part of the
$K$-core. In each subsequent iteration, any node with fewer than $K$ unpruned neighbors is
relabelled as pruned. In other words, a node of degree $k$ with $m$ pruned neighbors becomes
pruned if the number $k - m$ of its unpruned neighbors is smaller than $K$. In the
steady-state limit of this cascade process, precisely those nodes in the $K$-core remain
unpruned.
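The pruning cascade just described can be written directly as a short routine on an explicit graph. The sketch below is an illustration under assumed conventions: the graph is given as a dictionary mapping each node to the set of its neighbors, and the function name `k_core_nodes` is hypothetical.

```python
def k_core_nodes(adjacency, K):
    """Return the set of nodes left unpruned by the K-core cascade.

    adjacency: dict mapping each node to the set of its neighbors (assumed format).
    """
    unpruned = set(adjacency)
    while True:
        # Nodes with fewer than K unpruned neighbors are pruned in this sweep.
        to_prune = {v for v in unpruned
                    if sum(1 for u in adjacency[v] if u in unpruned) < K}
        if not to_prune:
            return unpruned  # steady state reached
        unpruned -= to_prune
```

For example, a triangle with a two-node tail attached has the triangle as its 2-core, and an empty 3-core.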

The cascade-based approach of section IV can be applied to calculate $k$-core sizes in the
clustered networks generated by the generalized Trapman model of section II. Similar to the
discussion preceding equation (21), we begin with the creation of a tree whose top (or root)
is a randomly selected node of the network. All nodes in the tree are initially in the
unpruned state, and we examine the propagation of the pruned fraction from level $n$ to
level $n+1$ in the tree, proceeding one level at a time. Our goal is the determination of
the probability that the top (or root) of the tree is pruned; this gives the final fraction
of pruned nodes in the original network. We define $q_n$ as the probability that a
super-individual at level $n$ is pruned. Similarly, denote by $b_n^{(k)}$ the probability
that a bachelor node of degree $k$ at level $n$ is pruned, and by $h_n^{(k)}$ the
probability that the top individual node in a household of degree $k$ is pruned. Equation
(21) of section IV then applies directly, and it remains only to define the updating rules
for $b_{n+1}^{(k)}$ and $h_{n+1}^{(k)}$.

To this end it is convenient to introduce response functions $F_b(m,k)$ and $F_h(m,k)$ which
respectively denote the probabilities that a $k$-degree bachelor or a $k$-degree household
become pruned when they have $m$ pruned neighbors. A bachelor becomes pruned when it has
fewer than $K$ unpruned neighbors, i.e. when $k - m < K$; otherwise it remains unpruned.
Therefore the bachelor response function is given by (see equation (10) of [22])

$$ F_b(m,k) = \begin{cases} 1, & k - m < K \\ 0, & k - m \ge K. \end{cases} \tag{27} $$

For households we must take account of the $k$-clique structure. First, if $k < K$ then
every node in the $k$-clique has fewer than $K$ (unpruned) neighbors, and so the entire
household is immediately pruned. Also, for the case $k = K$ the whole household becomes
pruned if any one of its neighbors is pruned, i.e. if $m > 0$, and remains unpruned
otherwise. Finally, no node in a $k$-clique can become pruned if $k > K$, because in this
case an individual of degree $k$ needs at least two pruned neighbors in order to become
pruned itself, but each node in the $k$-clique has only one external neighbor (and all nodes
in the $k$-clique are initially unpruned). Thus it is straightforward to see that

$$ F_h(m,k) = \begin{cases} F_b(m,k), & k \le K \\ 0, & k > K. \end{cases} \tag{28} $$

Next, since each child at level $n$ is independently pruned with probability $q_n$, a
bachelor or household of degree $k$ has exactly $m$ out of $k-1$ children pruned with
probability $\binom{k-1}{m} q_n^m (1-q_n)^{k-1-m}$. Therefore, summing over every possible
number of pruned children $m$ gives the probability that a bachelor node of degree $k$ at
level $n+1$ is pruned:

$$ b_{n+1}^{(k)} = \sum_{m=0}^{k-1} \binom{k-1}{m} q_n^m (1-q_n)^{k-1-m} F_b(m,k), \tag{29} $$

and a similar expression for a household can be written using equation (28) as

$$ h_{n+1}^{(k)} = \begin{cases} b_{n+1}^{(k)}, & k \le K \\ 0, & k > K. \end{cases} \tag{30} $$

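Note that $b_{n+1}^{(k)}$ and $h_{n+1}^{(k)}$ can also be written in a less general form
without the use of response functions as

$$ b_{n+1}^{(k)} = \begin{cases} 1, & k < K \\ \sum_{m=k-K+1}^{k-1} \binom{k-1}{m} q_n^m (1-q_n)^{k-1-m}, & k \ge K \end{cases} \tag{31} $$

and

$$ h_{n+1}^{(k)} = \begin{cases} 1, & k < K \\ 1 - (1-q_n)^{K-1}, & k = K \\ 0, & k > K. \end{cases} \tag{32} $$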

Using the update rules (29) and (30) (or alternatively (31) and (32)) in conjunction with
(21) enables us to iterate from an infinitesimally small positive $q_0$ to the steady state
$q_\infty$ corresponding to an infinite network.

Finally, consider the individual at the top (or root) of the infinite tree, assuming it has
degree $k$, i.e., $k$ children. With probability $1 - f_k$ it was a bachelor in the
super-graph, and by analogy with (29) is pruned with probability

$$ p_b^{(k)} = \sum_{m=0}^{k} \binom{k}{m} q_\infty^m (1-q_\infty)^{k-m} F_b(m,k). \tag{33} $$

Similarly, if the individual is in a household (with probability $f_k$), it is pruned with
probability

$$ p_h^{(k)} = \begin{cases} p_b^{(k)}, & k \le K \\ 0, & k > K. \end{cases} \tag{34} $$

Then the final density of pruned nodes in the individuals network is given by (cf. equation
(25))

$$ \rho = \sum_{k=0}^{\infty} P_k \left[ (1-f_k)\, p_b^{(k)} + f_k\, p_h^{(k)} \right], \tag{35} $$

and the fractional size of the $k$-core for $k = K$ is given by $1 - \rho$.
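A minimal numerical sketch of this $k$-core calculation is given below, combining the level-to-level update built from equations (21), (29) and (30) with the root evaluation of equations (33) to (35). It is an illustration under stated assumptions, not the authors' code: the name `kcore_fraction`, the starting value `q0`, the tolerance, and the use of super-graph inputs `P_super` and `g` (with the individuals-graph $P_k$ and $f_k$ obtained by assuming each degree-$k$ household expands into $k$ individuals of degree $k$) are all choices made for this example.

```python
from math import comb, exp, factorial


def kcore_fraction(P_super, g, K, q0=1e-9, tol=1e-12, max_iter=100000):
    """Return (fractional K-core size, q_infinity).

    P_super[k]: super-graph degree distribution (hypothetical input format).
    g[k]: probability that a degree-k super-individual is a household; g[0] must be 0.
    """
    kmax = len(P_super) - 1
    z_super = sum(k * P_super[k] for k in range(kmax + 1))

    def F_b(m, k):
        # Equation (27): pruned when fewer than K neighbors remain unpruned.
        return 1.0 if k - m < K else 0.0

    def pruned_prob(k, n_children, q):
        # Binomial sum appearing in equations (29) and (33).
        return sum(comb(n_children, m) * q ** m * (1 - q) ** (n_children - m)
                   * F_b(m, k) for m in range(n_children + 1))

    def update(q):
        # Equation (21) with b and h taken from equations (29) and (30).
        total = 0.0
        for k in range(1, kmax + 1):
            b = pruned_prob(k, k - 1, q)
            h = b if k <= K else 0.0
            total += k * P_super[k] / z_super * ((1 - g[k]) * b + g[k] * h)
        return total

    q = q0
    for _ in range(max_iter):
        q_next = update(q)
        converged = abs(q_next - q) < tol
        q = q_next
        if converged:
            break

    # Assumed conversion to individuals-graph quantities P_k and f_k.
    weight = [P_super[k] * (1 - g[k] + k * g[k]) for k in range(kmax + 1)]
    norm = sum(weight)
    rho = 0.0
    for k in range(kmax + 1):
        f_k = k * g[k] / (1 - g[k] + k * g[k])
        p_b = pruned_prob(k, k, q)          # equation (33)
        p_h = p_b if k <= K else 0.0        # equation (34)
        rho += weight[k] / norm * ((1 - f_k) * p_b + f_k * p_h)  # equation (35)
    return 1.0 - rho, q
```

With `K = 1` and no clustering the routine should simply return the fraction of nodes with non-zero degree, and with all degree-5 super-individuals expanded to 5-cliques the 3-core should be the whole network, in line with the argument that cliques with $k > K$ cannot be pruned.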

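We can combine equations (21), (29) and (30) to give an explicit self-consistent equation
for $q_\infty$:

$$ q_\infty = H(q_\infty) \equiv \sum_{k=1}^{\infty} \frac{k}{\tilde{z}} \tilde{P}_k \sum_{m=0}^{k-1} \binom{k-1}{m} q_\infty^m (1-q_\infty)^{k-1-m}\, W_k\, F_b(m,k), \tag{36} $$

where

$$ W_k = \begin{cases} 1, & k \le K \\ 1 - g_k, & k > K. \end{cases} \tag{37} $$

The iteration process for $q_n$ starting from infinitesimal $q_0$ converges to the lowest
solution of the self-consistent equation (36).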

The analysis of section IV of [23] may be applied here to provide an interpretation for
$q_\infty$ in terms of measurable quantities on the network. Let $L_K$ be the number of
edges in the super-graph which connect two individuals belonging to the $K$-core, and let
$L$ be the total number of edges in the super-graph. Then, as shown in Appendix C,

$$ \frac{L_K}{L} = (1 - q_\infty)^2, \tag{38} $$

i.e., the quantity $q_\infty$ is related to the fraction of super-graph edges which link
individuals in the $K$-core.

In the limit of zero clustering ($f_k = g_k = 0$ for all $k$), equations (35) and (36)
reduce to existing results for $k$-cores on (undamaged) configuration model networks, as in
equations (1) and (2) of [26] via the mapping of notation $q_\infty \mapsto R$,
$\rho \mapsto 1 - M$; see Appendix D.

Figures 6(a) and 6(b) show comparisons between the theory and numerical calculations of
$k$-core sizes on clustered networks generated by the algorithm of section II. Figure 6(a)
is for a network with Poisson degree distribution with $z = 3$ (cf. Figure 5(a)). The
unclustered ($f_k = 0$) case has no $k$-cores for $k > 2$, but the presence of cliques leads
to non-zero $k$-core sizes for all $k$ with $f_k > 0$. Since we use finite-size graphs we
cannot numerically resolve $k$-cores of fractional size smaller than $1/N$ (N = 10^ here)
but the agreement between theory and simulation is excellent for $K$ up to approximately 10.
A network with truncated power-law degree distribution (11) with $k_{\max} = 30$ (cf. Figure
5(b)) has $k$-core sizes as shown in Figure 6(b). Again, non-zero clustering leads to
non-zero $k$-core sizes for all $K$ up to $k_{\max}$, and agreement between theory and
numerics is excellent except for finite size effects upon very small $k$-cores.



VI. CONCLUSIONS 

We have shown that a generalization of the Trapman model [H, [l^ of clustered clique-tree
networks has several analytically tractable features. These include the ability to calculate
the bond percolation threshold, size of the giant connected component, and sizes of
$k$-cores. The algorithm for generating realizations of model networks is described in
section II. The degree distribution $P_k$ of the network is specified, along with the
fraction $f_k$ of $k$-degree nodes residing in $k$-cliques. The parameters $f_k$ are related
to the degree-dependent clustering coefficients $c_k$ by equation and so allow us to tune
the level of clustering in the network.

FIG. 6: (Color online) $K$-core sizes in (a) clustered Poisson random graph with mean degree
$z = 3$, and in (b) clustered graphs with truncated power-law degree distribution
$P_k \propto$ k~^'^ for $3 \le k \le k_{\max}$, with $k_{\max} = 30$ here. Symbols are the
results of numerical simulations on a single network with N = 10^ individuals, and curves
show the analytical result from Eq. (35). The fraction $f_k$ of individuals of degree $k$
which are members of households ($k$-cliques) is $f_k = (2/(k-1))^{\beta}$ with $\beta$
taking values indicated in the legend. The unclustered case ($f_k = 0$) is also shown for
comparison.



The main analytical results are equation (5) for the bond percolation threshold, and the
iteration schemes of sections IV and V (see equations (25) and (35)) for the sizes of the
giant connected component and $k$-cores, respectively. The percolation threshold $p_c$ is
determined by solving the polynomial equation (5); see Figures 2 and 3 for examples. We have
also examined explicit upper and lower bounds for $p_c$ (see section III C). Of particular
interest is the relationship between $p_c$ and the percolation threshold $p_c^{\rm rand}$ in
a randomly-wired (unclustered) network with the same degree distribution (although we also
give some results for the degree correlations, see section III D and Appendix B). Our
results indicate that for a given level of clustering within this class of structured random
networks, $p_c$ may be greater than, or less than, $p_c^{\rm rand}$, depending on the degree
distribution of the network. This contrasts with the results of [11], where weakly clustered
networks (with $c_k < 1/(k-1)$) have $p_c > p_c^{\rm rand}$, while in the strongly clustered
case with $c_k > 1/(k-1)$ the clustering decreases the threshold, so
$p_c < p_c^{\rm rand}$. Indeed, we show in section III C that the Trapman model with
$f_k = F$, a constant for all $k$, leads to clustering increasing the percolation threshold:
$p_c > p_c^{\rm rand}$, whereas the classification of this case as strongly clustered
according to [11] (since $c_k = F(1 - 2/k)$ here) would predict the opposite conclusion.

Similarly, Figure 3 gives clear examples of cases (e.g. $\beta = 2$) where $c_k < 1/(k-1)$,
but the result of $p_c < p_c^{\rm rand}$ is the opposite to that predicted by [11] for the
weakly clustered case. These contradictions to the results of [11] are not surprising when
we consider that the approach of [11] is focussed on clustering due to loops of length three
(i.e. triangles) in the graph. Indeed, the authors of [11] carefully point out that they do
not consider effects of longer loops. By contrast, the clustering within the Trapman model
is more heavily localized, since a node of degree $k$ which is a member of a triangle must
also be part of a loop of length $n$ for all $n$ from 3 to $k$. Therefore we should not
expect the theory of [11] to apply to the Trapman model; nevertheless it is instructive to
find that model networks with the same degree distributions $P_k$ and clustering
coefficients $c_k$ can give opposite results for this important question. Higher order
information, e.g. some measure of the density of loops of length greater than three (5l| .
is required to distinguish the two types of networks from each other.

The model of clustering described here has the important advantage of analytical
tractability, permitting us to calculate the bond percolation threshold and sizes of
$k$-cores and giant connected components. However, the model is limited in its applicability
to real-world networks by the rather artificial structure of clustering using $k$-cliques,
which is not expected to be the dominant form of triangle formation within most real-world
networks. Bearing in mind this caveat, we use the $P_k$ and $c_k$ parameters of some
real-world networks (see Table I) to find the values of $p_c$ predicted by equation (5). In
some cases (power grid, PGP) we find $p_c > p_c^{\rm rand}$, while in others (e.g. Internet,
WWW) the opposite conclusion is reached. The applicability of this and related models to
real-world networks will be the topic of further study.
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APPENDIX A: CLIQUE CALCULATIONS 

Newman [13] gives results relevant to the bond percolation problem on a $k$-clique, i.e. a
complete graph of $k$ nodes. Here we briefly review these results and show how they can be
applied to calculate connectivity properties of the individuals graph.

For bond occupation probability $p$, the damaged $k$-clique may consist of a number of
disconnected clusters of nodes. Letting $P(m|k)$ be the probability that a randomly chosen
node in the damaged $k$-clique belongs to a connected cluster of $m$ nodes (including
itself), equation (7) in [13] gives

$$ P(m|k) = \binom{k-1}{m-1} (1-p)^{m(k-m)} P(m|m). \tag{39} $$

The probabilities $P(m|m)$ may be determined iteratively from the relation

$$ P(k|k) = 1 - \sum_{m=1}^{k-1} P(m|k), \tag{40} $$

with $P(1|1) = 1$. Consider an individual A in a damaged household of $k$ individuals. We
seek the number of external super-individuals which are connected to A via undamaged paths
through his household; note we do not count A's own direct external link. The individual A
is connected to $m-1$ other individuals in the household with probability $P(m|k)$, and each
of these other individuals has a single link external to the household, which is undamaged
with probability $p$. Thus the average number of undamaged external links from the connected
cluster (and hence from A) to other super-individuals is

$$ D_k(p) = \sum_{m=1}^{k} P(m|k)\,(m-1)\,p. \tag{41} $$

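The polynomials $D_k(p)$ for some low values of $k$ are given below:

$$ \begin{aligned} D_3(p) &= 2p^2\left(1 + p - p^2\right), \\ D_4(p) &= 3p^2\left(1 + 2p - 7p^3 + 7p^4 - 2p^5\right), \\ D_5(p) &= 4p^2\left(1 + 3p + 3p^2 - 15p^3 - 27p^4 + 127p^5 - 175p^6 + 120p^7 - 42p^8 + 6p^9\right). \end{aligned} \tag{42} $$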
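As an illustrative check of equations (39) to (42), the following minimal sketch builds $P(m|k)$ iteratively and evaluates $D_k(p)$ numerically, so that the result can be compared with the closed-form polynomials. The function names `clique_cluster_probs` and `D` are hypothetical and chosen only for this example.

```python
from math import comb


def clique_cluster_probs(k, p):
    """Return [P(1|k), ..., P(k|k)] for a k-clique with bond occupation probability p."""
    full = {}  # full[j] stores P(j|j), the probability that a j-clique stays connected
    row = []
    for size in range(1, k + 1):
        # Equation (39) for cluster sizes m < size, using the P(m|m) values found so far.
        row = [comb(size - 1, m - 1) * (1 - p) ** (m * (size - m)) * full[m]
               for m in range(1, size)]
        # Equation (40): the cluster-size probabilities sum to one.
        full[size] = 1.0 - sum(row)
        row.append(full[size])
    return row


def D(k, p):
    """Equation (41): mean number of undamaged external links reached via the household."""
    probs = clique_cluster_probs(k, p)
    return sum((m - 1) * p * probs[m - 1] for m in range(1, k + 1))
```

Evaluating `D(3, p)`, `D(4, p)` and `D(5, p)` at any $p$ in $[0, 1]$ should agree with the polynomials of equation (42) to rounding error.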



APPENDIX B: DEGREE-DEGREE 
CORRELATIONS 

We consider the calculation of $P(k,j)$, the joint pdf of degrees of vertices at either end
of a randomly chosen edge in the individuals graph, for the special case of $f_k = F = 1$
for all $k$, and with $P_k = 0$ for $k < 3$. We begin by noting that the number of edges in
the super-graph is $\tilde{N}\tilde{z}/2$, and each of these also exists in the individuals
graph as an external edge joining two individuals in different households. Since $F = 1$,
every super-individual of degree $k$ is a household, and so is expanded in the individuals
graph to a $k$-clique; this adds a total of $\tilde{N}\sum_k \tilde{P}_k k(k-1)/2$ further
edges to the individuals graph. Therefore, a randomly chosen edge in the individuals graph
is an external edge with probability

$$ \frac{\tilde{N}\tilde{z}/2}{\tilde{N}\tilde{z}/2 + \tilde{N}\sum_k \tilde{P}_k k(k-1)/2}, \tag{43} $$

and using equation (3) with $F = 1$ (and $P_k = 0$ for $k < 3$) reduces this to $1/z$.

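An external edge has end-vertex degrees $k$ and $j$ with probability

$$ \frac{k\tilde{P}_k}{\tilde{z}} \frac{j\tilde{P}_j}{\tilde{z}}, \tag{44} $$

since the super-graph is an uncorrelated random graph. An internal edge is in a $j$-clique
with relative probability

$$ \frac{\tilde{P}_j\, j(j-1)/2}{\sum_{k'} \tilde{P}_{k'} k'(k'-1)/2} = \frac{(j-1) P_j}{z - 1}, \tag{45} $$

and its end-vertex degrees are both equal to $j$. Combining all the possibilities, we obtain
equation (19):

$$ P(k,j) = \frac{1}{z} P_k P_j + \left(1 - \frac{1}{z}\right) \frac{(j-1) P_j}{z-1}\, \delta_{k,j}. \tag{46} $$

The average degree of neighbors of nodes with degree $k$ is then

$$ \langle k \rangle_{nn} = \frac{\sum_j j\, P(k,j)}{\sum_j P(k,j)}. \tag{47} $$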
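Equations (46) and (47) can be evaluated numerically for any degree distribution with $P_k = 0$ for $k < 3$. The sketch below is an assumed illustration (the function names and the list-based input format are hypothetical); it also gives a convenient consistency check, since summing equation (46) over $j$ should return the edge-end degree distribution $k P_k / z$.

```python
def joint_degree_pdf(P):
    """Equation (46) for f_k = 1: returns (joint, z), where joint[k][j] = P(k, j).

    P[k] is the individuals-graph degree distribution, with P[k] = 0 for k < 3.
    """
    kmax = len(P) - 1
    z = sum(k * P[k] for k in range(kmax + 1))
    # External-edge contribution: uncorrelated product weighted by 1/z.
    joint = [[P[k] * P[j] / z for j in range(kmax + 1)] for k in range(kmax + 1)]
    # Internal (clique) edges join two nodes of equal degree j.
    for j in range(kmax + 1):
        joint[j][j] += (1 - 1 / z) * (j - 1) * P[j] / (z - 1)
    return joint, z


def mean_neighbor_degree(joint, k):
    """Equation (47): average degree of the neighbors of a degree-k node."""
    row = joint[k]
    return sum(j * row[j] for j in range(len(row))) / sum(row)
```

Carrying out the sums in equation (47) with equation (46) gives $\langle k \rangle_{nn} = k - 1 + z/k$, which grows with $k$ for large $k$, consistent with the assortative character of these fully clustered networks.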



APPENDIX C: RELATION BETWEEN ORDER 
PARAMETER AND EDGE STATISTICS 



Following [23], we derive here equation (38) for the fraction of edges in the super-graph
which link two unpruned super-individuals, i.e. super-individuals belonging to the $K$-core.
Note from the discussion preceding equation (28) that all individuals of a household are in
the same state and so we may speak of super-individuals as pruned or unpruned.

Consider the super-graph where the cascade has ended and all the nodes in the graph have
been updated. Let us first calculate $L_K$, the number of edges in the super-graph which
connect unpruned super-individuals. Taking all super-individuals one by one and counting
links to any of their unpruned neighbors (if the chosen super-individual is itself unpruned)
will give $2L_K$.

In order to calculate the expected value of this quantity we consider a randomly-chosen
super-individual of the super-graph. Taking this as the root of the tree approximation of
the super-graph, we suppose it has degree $k$ and $m \le k$ pruned children. The probability
that $m$ of its $k$ children are pruned (meaning that $k - m$ children are unpruned) is
$\binom{k}{m} q_\infty^m (1-q_\infty)^{k-m}$, where $q_\infty$ is the order parameter given
by the solution of the self-consistent equation (36).

The state of the root depends on the state of its children as follows. The root can be
either a bachelor (which happens with probability $1 - g_k$) or a household (which happens
with probability $g_k$). In each of these cases it is respectively pruned with probability
$F_b(m,k)$ and $F_h(m,k)$, which are given by equations (27) and (28). Therefore, the
probability that the root chosen at random is pruned when it has $m$ pruned children is
given by a weighted sum of probabilities

$$ F(m,k) = (1 - g_k) F_b(m,k) + g_k F_h(m,k) = W_k F_b(m,k), \tag{48} $$

where $W_k$ is defined by equation (37).

Combining the probabilities together, the expected number of edges linking an unpruned root
of degree $k$ to its unpruned children is

$$ \sum_{m=0}^{k} (k-m) \binom{k}{m} q_\infty^m (1-q_\infty)^{k-m} \left(1 - F(m,k)\right), \tag{49} $$

where the $(k-m)$ factor counts the unpruned children, given that $m$ of the $k$ children
are pruned, while the $(1 - F(m,k))$ term accounts for the root node being unpruned.
Averaging this over the degree distribution of the super-graph and multiplying by the number
of nodes gives

$$ \frac{2L_K}{\tilde{N}} = \sum_{k=0}^{\infty} \tilde{P}_k \sum_{m=0}^{k} (k-m) \binom{k}{m} q_\infty^m (1-q_\infty)^{k-m} \left(1 - F(m,k)\right). \tag{50} $$



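The fraction of edges in the super-graph linking unpruned super-individuals is found by
dividing the right hand side of (50) by the total number of edges in the super-graph
$L = \tilde{N}\tilde{z}/2$, to obtain

$$ \frac{L_K}{L} = \sum_{k=0}^{\infty} \frac{\tilde{P}_k}{\tilde{z}} \sum_{m=0}^{k} (k-m) \binom{k}{m} q_\infty^m (1-q_\infty)^{k-m} \left(1 - F(m,k)\right). \tag{51} $$

Using the identity $(k-m)\binom{k}{m} = k\binom{k-1}{m}$ and factoring out $(1 - q_\infty)$,
this can be written as

$$ (1 - q_\infty) \times \sum_{k=0}^{\infty} \frac{k\tilde{P}_k}{\tilde{z}} \sum_{m=0}^{k-1} \binom{k-1}{m} q_\infty^m (1-q_\infty)^{k-1-m} \left(1 - F(m,k)\right). \tag{52} $$

Finally, rewriting the last expression as

$$ (1 - q_\infty) \times \left( 1 - \sum_{k=0}^{\infty} \frac{k\tilde{P}_k}{\tilde{z}} \sum_{m=0}^{k-1} \binom{k-1}{m} q_\infty^m (1-q_\infty)^{k-1-m} F(m,k) \right), \tag{53} $$

and using (36) gives $(1 - q_\infty)^2$. Equation (38) of the main text follows immediately.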



APPENDIX D: ZERO-CLUSTERING LIMIT OF 
K-CORE SIZE 



In the unclustered case the self-consistent equation (36) reduces to
$q_\infty = H(q_\infty)$, with $H(q_\infty)$ given by

$$ \sum_{k=1}^{\infty} \frac{k P_k}{z} \sum_{m=0}^{k-1} \binom{k-1}{m} q_\infty^m (1-q_\infty)^{k-1-m} F_b(m,k), \tag{54} $$

where $F_b(m,k) = 1$ if $m > k - K$ and zero otherwise. We show that the right-hand side of
this equation is the same as in equation (2) of [26] in the undamaged networks case.

The sum over $k$ is first expressed as a sum over $i$, with $i = k - 1$:

$$ \sum_{i=0}^{\infty} \frac{(i+1) P_{i+1}}{z} \sum_{m=0}^{i} \binom{i}{m} q_\infty^m (1-q_\infty)^{i-m} F_b(m, i+1). \tag{55} $$

Next, the sum over $m$ is re-ordered to a sum over $n$, with $n = i - m$, and using the fact
that $\binom{i}{i-n} = \binom{i}{n}$:

$$ \sum_{i=0}^{\infty} \frac{(i+1) P_{i+1}}{z} \sum_{n=0}^{i} \binom{i}{n} q_\infty^{i-n} (1-q_\infty)^{n} F_b(i-n, i+1). \tag{56} $$

The double sum $\sum_{i=0}^{\infty}\sum_{n=0}^{i}$ is rewritten as
$\sum_{n=0}^{\infty}\sum_{i=n}^{\infty}$, and using the fact that $F_b(i-n, i+1)$ is 1 only
for $n < K - 1$ we obtain

$$ \sum_{n=0}^{K-2} \sum_{i=n}^{\infty} \frac{(i+1) P_{i+1}}{z} \binom{i}{n} q_\infty^{i-n} (1-q_\infty)^{n}. $$

This, with the notation mapping $q_\infty \mapsto R$, gives equation (2) of [26] (with
$p = 1$). Similar manipulations reduce the zero-clustering version of equation (35) to
equation (1) of [26], with the notation mapping $\rho \mapsto 1 - M$.